feat: implement local ML stem separation with chunking #111
This commit introduces the AudioStemSeparator class to perform local audio separation using PyTorch and demucs (or torchaudio implementations). It handles chunked inference to prevent OOM errors, includes extensive mock testing to maintain 100% test coverage, and integrates with the CLI's main analyze command. It also updates the supply chain inventory for the newly added dependencies (torch, torchaudio, torchvision).
📝 Walkthrough
Overview
This change integrates Demucs-based audio stem separation into the analysis engine.
Changes
Sequence diagram
sequenceDiagram
participant CLI as CLI Handler
participant Librosa as Librosa
participant Separator as AudioStemSeparator
participant Demucs as Demucs Model
participant Torch as Torch/Device
CLI->>Librosa: load(audio_path, sr=44100, mono=False)
Librosa-->>CLI: audio_data, sr
CLI->>Separator: separate_audio(audio_data, sr, segment_seconds=2.0)
Separator->>Demucs: _load_model("htdemucs")
Demucs-->>Separator: model instance
Separator->>Separator: eval mode, shape normalization
Separator->>Torch: check CUDA/MPS availability
Torch-->>Separator: device selection (cuda/mps/cpu)
Separator->>Demucs: apply_model(mix, split=True, overlap=0.25, segment=2.0)
Demucs->>Torch: inference under no_grad()
Torch-->>Demucs: separated stems tensor
Demucs-->>Separator: stems (batch, channels, samples)
Separator->>Torch: convert to CPU numpy
Torch-->>Separator: dict[stem_name, np.ndarray]
Separator-->>CLI: {vocals, bass, drums, other}
CLI->>CLI: log stems, continue analysis
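The `split=True, overlap=0.25, segment=2.0` call in the diagram relies on chunked inference to bound peak memory. The sketch below only illustrates how overlapping chunk boundaries can be derived from those three values; `chunk_spans` is a hypothetical helper, and the stride formula is an assumption about the general technique, not a copy of Demucs internals.

```python
def chunk_spans(total_samples, segment_samples, overlap):
    """Return (start, end) sample spans for overlapping chunked inference.

    Hypothetical helper: each chunk is at most `segment_samples` long and
    consecutive chunks share roughly `overlap * segment_samples` samples.
    """
    if segment_samples <= 0:
        raise ValueError("segment_samples must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between chunk starts; never let it collapse to zero.
    stride = max(1, int((1.0 - overlap) * segment_samples))
    return [
        (start, min(start + segment_samples, total_samples))
        for start in range(0, total_samples, stride)
    ]


SAMPLE_RATE = 44100      # matches sr=44100 in the diagram
SEGMENT_SECONDS = 2.0    # matches segment_seconds=2.0 in the diagram

# Ten seconds of audio split into 2-second chunks with 25% overlap.
spans = chunk_spans(10 * SAMPLE_RATE, int(SEGMENT_SECONDS * SAMPLE_RATE), 0.25)
```

Only one chunk has to be resident on the device at a time, which is why a shorter segment lowers peak memory at the cost of more forward passes.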
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
@coderabbitai review
✅ Actions performed
Review triggered.
@coderabbitai review
✅ Actions performed
Review triggered.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@.github/workflows/build-baseline.yml`:
- Around line 224-226: The workflow step named "Sync Python dependencies"
currently skips macOS x86_64 by using the if-condition; revert that skip so the
uv sync --project services/analysis-engine --group dev --frozen runs on macOS
x86_64 (remove or adjust the if guard) and instead address the PyTorch wheel
issue by adding a CI-friendly workaround: either pin a compatible PyTorch wheel
in the analysis-engine dependency config (pyproject/requirements) or add
commands in the workflow to install/build a macOS x86_64-compatible PyTorch
(e.g., install a compatible wheel or build from source) before running uv sync;
do not document this as a policy exception — keep the sync step enforced per the
cross-platform build policy.
In `@services/analysis-engine/src/bandscope_analysis/cli.py`:
- Around line 94-106: The stem separation block must be executed only after
request validation and only when the explicit opt-in flag is set; move the
librosa.load / AudioStemSeparator logic so it runs after your validator returns
success and behind a check for the CLI/handler flag (e.g. --separate-stems or a
boolean parameter separate_stems), and keep the existing try/except around the
heavy work (librosa.load, AudioStemSeparator(), separator.separate_audio) so
malformed payloads can't trigger file access or model inference before
validation; specifically, gate the use of audio_path, librosa.load,
AudioStemSeparator, and separator.separate_audio on validation success and
separate_stems == True.
- Around line 102-106: The stem-separation step currently swallows all
exceptions and never uses the produced stems (stems is only logged), so
separation can fail silently or be wasted on success; update the flow so that
AudioStemSeparator().separate_audio(...) returns are passed into the downstream
call (e.g., include stems in the payload or call run_analysis_job(request,
stems=stems)) when successful, and do not convert all errors to warnings — catch
only expected exceptions or log the full error and propagate/fail the job (raise
or return an error status) instead of continuing as success; update the
try/except around separate_audio to call run_analysis_job with the stems
variable on success and to surface failures (or use a controlled mock fallback
with an explicit flag) rather than a silent logging.warning.
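A minimal sketch of the ordering the two `cli.py` comments above ask for: validate first, separate only on explicit opt-in, pass the stems downstream, and surface failures instead of downgrading them to a warning. `handle_request`, `SeparationError`, and the injected `validate`, `separate`, and `run_job` callables are hypothetical names for illustration, not the project's actual API.

```python
import logging

logger = logging.getLogger("stem_separation_sketch")


class SeparationError(RuntimeError):
    """Raised when stem separation fails so the job cannot report success."""


def handle_request(request, validate, separate, run_job):
    """Run an analysis job, separating stems only after validation and opt-in."""
    # Validation runs first, so a malformed payload never reaches file access
    # or model inference.
    validate(request)

    stems = None
    if request.get("separate_stems", False):
        try:
            stems = separate(request["audio_path"])
        except (OSError, RuntimeError) as exc:
            # Catch only the expected failure types, log them, and propagate.
            logger.error("stem separation failed: %s", exc)
            raise SeparationError(str(exc)) from exc

    # The stems are handed to the downstream job instead of only being logged.
    return run_job(request, stems=stems)
```

Injecting the collaborators as callables keeps the ordering testable without loading audio or a model.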
In
`@services/analysis-engine/src/bandscope_analysis/separation/audio_separator.py`:
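- Around line 38-46: The current _load_model call uses
demucs.pretrained.get_model(self.model_name) directly, which allows a remote
download at runtime and bypasses integrity verification. Change it to verify
locally pre-provisioned weights and their checksum first, and to fail explicitly
when they are missing: update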
_load_model to first resolve a project-local artifact path for self.model_name
(or a configured model directory), verify the file exists and validate its
checksum/signature, only then load the model (and call .eval()); remove or gate
any fallback to demucs.pretrained.get_model that would perform a network fetch
so that cold-cache runs fail fast with a clear error rather than downloading
remotely.
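The fail-fast check described above could look roughly like the following sketch, which uses only the standard library. `verify_local_weights` and `ModelIntegrityError` are hypothetical names, and SHA-256 is an assumed algorithm; how the verified file is then handed to the model loader is left out because the source does not specify it.

```python
import hashlib
from pathlib import Path


class ModelIntegrityError(RuntimeError):
    """Raised when local model weights do not match the expected checksum."""


def verify_local_weights(path, expected_sha256):
    """Return the weights path only if it exists and its SHA-256 matches."""
    path = Path(path)
    if not path.is_file():
        # Cold cache: fail with a clear error instead of fetching remotely.
        raise FileNotFoundError(f"model weights not provisioned: {path}")

    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB blocks so large weight files are not read at once.
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)

    if digest.hexdigest() != expected_sha256.lower():
        raise ModelIntegrityError(f"checksum mismatch for {path}")
    return path
```

A loader would call this before touching the weights and would not fall back to a network fetch when it raises.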
In `@services/analysis-engine/tests/test_audio_separator.py`:
- Around line 42-64: The test currently only verifies apply_model was called but
not that it received the OOM-mitigation parameters; update the test in
test_audio_separator.py to assert apply_model was called with the expected
kwargs by inspecting mock_apply_model.call_args.kwargs after calling
separator.separate_audio (use the existing mock_apply_model and
separator.separate_audio/segment_seconds), and assert kwargs["split"] is True,
kwargs["segment"] == 2.0 (or segment_seconds), kwargs["overlap"] == 0.25,
kwargs["shifts"] == 1, and kwargs["progress"] is False; keep the existing checks
for call counts and returned stems.
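As an illustration of the assertion style requested above, the sketch below inspects `call_args.kwargs` on a `unittest.mock.Mock`. The `separate_audio` function here is a simplified, hypothetical stand-in that takes `apply_model` as a parameter; the real test would patch the module's `apply_model` and call the separator's own method.

```python
from unittest.mock import Mock


def separate_audio(apply_model, model, mix, segment_seconds=2.0):
    """Hypothetical stand-in for the separator's call into apply_model."""
    return apply_model(
        model,
        mix,
        split=True,
        segment=segment_seconds,
        overlap=0.25,
        shifts=1,
        progress=False,
    )


def test_apply_model_receives_chunking_kwargs():
    mock_apply_model = Mock(return_value="stems")

    result = separate_audio(mock_apply_model, "model", "mix", segment_seconds=2.0)

    # Existing style of check: the call happened and its result is returned.
    assert result == "stems"
    mock_apply_model.assert_called_once()

    # New checks: the OOM-mitigation parameters actually reached apply_model.
    kwargs = mock_apply_model.call_args.kwargs
    assert kwargs["split"] is True
    assert kwargs["segment"] == 2.0
    assert kwargs["overlap"] == 0.25
    assert kwargs["shifts"] == 1
    assert kwargs["progress"] is False


test_apply_model_receives_chunking_kwargs()
```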
In `@services/analysis-engine/tests/test_cli.py`:
- Around line 377-443: The test
test_cli_main_temporal_analyzer_and_separator_mock_success currently only checks
jobId and can pass without calling librosa.load or
AudioStemSeparator.separate_audio; update the test to assert those functions are
actually invoked with expected args: replace or wrap the current monkeypatch for
librosa.load and
bandscope_analysis.separation.audio_separator.AudioStemSeparator with
spies/mocks that record call counts and parameters, then after cli.main() assert
librosa.load was called with the expected path/sr/mono/duration and that
AudioStemSeparator.separate_audio was called with the expected audio,
sample_rate and segment_seconds; also add a complementary test (or extend this
one) exercising the --separate-stems gate to confirm separation is skipped when
the flag is absent and that separate_audio is not called.
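One way to exercise the `--separate-stems` gate with spies is sketched below. The `main` function and parser are simplified, hypothetical stand-ins that receive `load_audio` and `separate_audio` as arguments; the project's test would instead monkeypatch `librosa.load` and `AudioStemSeparator` and call `cli.main()`.

```python
import argparse
from unittest.mock import Mock


def build_parser():
    parser = argparse.ArgumentParser(prog="analyze-sketch")
    parser.add_argument("audio_path")
    parser.add_argument("--separate-stems", action="store_true")
    return parser


def main(argv, load_audio, separate_audio):
    """Hypothetical CLI entry point with injectable collaborators."""
    args = build_parser().parse_args(argv)
    if not args.separate_stems:
        return None  # gate closed: no file access, no inference
    audio, sample_rate = load_audio(args.audio_path, sr=44100, mono=False)
    return separate_audio(audio, sample_rate, segment_seconds=2.0)


def test_separation_skipped_without_flag():
    load, separate = Mock(), Mock()
    assert main(["song.wav"], load, separate) is None
    load.assert_not_called()
    separate.assert_not_called()


def test_separation_called_with_flag():
    load = Mock(return_value=("audio", 44100))
    separate = Mock(return_value={"vocals": "v"})
    assert main(["song.wav", "--separate-stems"], load, separate) == {"vocals": "v"}
    # The spies record the exact arguments, so a skipped call cannot pass.
    load.assert_called_once_with("song.wav", sr=44100, mono=False)
    separate.assert_called_once_with("audio", 44100, segment_seconds=2.0)


test_separation_skipped_without_flag()
test_separation_called_with_flag()
```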
In `@supply-chain/supplemental-component-inventory.json`:
- Around line 14-23: The htdemucs modelArtifacts entry is missing checksum
verification data; add a checksum field (and algorithm if your schema supports
it, e.g., checksumAlgorithm or checksumType) to the htdemucs object so releases
can validate cached weights; locate the "modelArtifacts" array and the entry
with "name": "htdemucs" (and "version": "demucs-v4.0", "sourceUrl": "...") and
include the checksum value and algorithm consistent with other artifacts in this
inventory.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d034c4c0-df4d-4382-ac78-7aa825c1dea5
⛔ Files ignored due to path filters (1)
- services/analysis-engine/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
- .github/workflows/build-baseline.yml
- services/analysis-engine/pyproject.toml
- services/analysis-engine/src/bandscope_analysis/cli.py
- services/analysis-engine/src/bandscope_analysis/separation/audio_separator.py
- services/analysis-engine/tests/test_audio_separator.py
- services/analysis-engine/tests/test_cli.py
- supply-chain/supplemental-component-inventory.json
@coderabbitai resolve
✅ Action performed
Comments resolved and changes approved.
@coderabbitai resolve
✅ Action performed
Comments resolved and changes approved.
@coderabbitai resolve
Rate Limit Exceeded
Updated #111 to fix the failing checks after the scheduler/develop merge.
What changed:
Validation:
Security Notes:
@coderabbitai approve
@coderabbitai review
✅ Action performed
Review finished.
✅ Action performed
Comments resolved and changes approved.
ddf32ad to d0853d8
Correction to the previous validation note: the final pushed dependency stack is
Additional validation after the final lockfile:
Security Notes update:
@coderabbitai review
@coderabbitai approve
✅ Action performed
Review finished.
✅ Action performed
Comments resolved and changes approved.
Closing this PR instead of merging it. Current live state: #111 is approved but
Concrete blockers:
Please reopen as a fresh, focused PR if stem separation is still desired: start from current
Pull request was closed
Description
This PR integrates local-first source separation into the bandscope-analysis engine. It utilizes torch and torchaudio (demucs) to split ingested audio into discrete instrumental/vocal stems.
Key Changes:
- AudioStemSeparator class with device management (CUDA/MPS/CPU).
- cli.py to support the --separate-stems flag.
- test_audio_separator.py and test_cli.py.
Resolves #106
Security Notes
gc.collect() and clearing cache.
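A rough sketch of what a post-inference cleanup along these lines might look like, assuming "clearing cache" refers to PyTorch's device allocator caches (the note above does not say so explicitly). `release_memory` is a hypothetical helper, and the torch import is optional so the function degrades to a plain garbage collection when torch is absent.

```python
import gc


def release_memory():
    """Collect Python garbage, then release cached device memory if possible.

    Returns the number of unreachable objects found by gc.collect().
    """
    collected = gc.collect()

    try:
        import torch  # optional: only needed for device cache clearing
    except ImportError:
        return collected

    if torch.cuda.is_available():
        # Returns cached, currently unused CUDA blocks to the driver.
        torch.cuda.empty_cache()

    # Guarded because the MPS helpers only exist in newer torch releases.
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available() and hasattr(torch, "mps"):
        torch.mps.empty_cache()

    return collected
```

Calling it after the stems have been copied to CPU arrays keeps device memory from accumulating across jobs in a long-running process.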